21 research outputs found

    Administrative social science data: The challenge of reproducible research

    Powerful new social science data resources are emerging. One particularly important source is administrative data, which were originally collected for organisational purposes but often contain information that is suitable for social science research. In this paper we outline the concept of reproducible research in relation to micro-level administrative social science data. Our central claim is that a planned and organised workflow is essential for high quality research using micro-level administrative social science data. We argue that it is essential for researchers to share research code, because code sharing enables the elements of reproducible research. First, it enables results to be duplicated and therefore allows the accuracy and validity of analyses to be evaluated. Second, it facilitates further tests of the robustness of the original piece of research. Drawing on insights from computer science and other disciplines that have been engaged in e-Research, we discuss and advocate the use of Git repositories as a usable and effective way to share research code and to make social science research using micro-level administrative data reproducible.

    dispel4py: A Python framework for data-intensive scientific computing

    This paper presents dispel4py, a new Python framework for describing abstract stream-based workflows for distributed data-intensive applications. These combine the familiarity of Python programming with the scalability of workflows. Data streaming is used to gain performance, rapid prototyping and applicability to live observations. dispel4py enables scientists to focus on their scientific goals, avoiding distracting details and retaining flexibility over the computing infrastructure they use. The implementation, therefore, has to map dispel4py abstract workflows optimally onto target platforms chosen dynamically. We present four dispel4py mappings: Apache Storm, message-passing interface (MPI), multi-threading and sequential, showing two major benefits: a) smooth transitions from local development on a laptop to scalable execution for production work, and b) scalable enactment on significantly different distributed computing infrastructures. Three application domains are reported and measurements on multiple infrastructures show the optimisations achieved; they have provided demanding real applications and helped us develop effective training. dispel4py.org is an open-source project to which we invite participation. The effective mapping of dispel4py onto multiple target infrastructures demonstrates exploitation of data-intensive and high-performance computing (HPC) architectures and consistent scalability.

    The Spin Structure of the Nucleon

    We present an overview of recent experimental and theoretical advances in our understanding of the spin structure of protons and neutrons. Comment: 84 pages, 29 figures

    Leptonic Production of Baryon Resonances

    In these lectures, the author focuses on the electromagnetic transition between non-strange baryon states. This sector received much attention in the early 1970s after the development of the first dynamical quark models. However, experimental progress was slow, partly because of the low rates associated with electromagnetic interactions, and partly because of the lack of guidance by theoretical models that went beyond the simplest quark models. It was also difficult for experiments to achieve the precision needed for a detailed analysis of the entire resonance region in terms of the fundamental photocoupling amplitudes over a large range in momentum transfer.

    GARUDA: Pan-Indian distributed e-infrastructure for compute-data intensive collaborative science
